Implementation of a Parallel Radix Sort on Cell Processor
Abstract
The Cell Broadband Engine (CBE) possesses tremendous raw computational power. Nevertheless, developing programs that fully exploit that power is challenging. The Cell Speed Challenge 2007 gives its contestants the opportunity to strive for the highest data-sorting performance on Cell. This work describes the design and implementation of a parallel radix sort. Data buffering and scalar code optimizations are employed to make the algorithm work effectively on Cell. Evaluation results show that significant performance can be achieved.
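The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general shape a parallel least-significant-digit radix sort usually takes, not the authors' implementation. The function name `parallel_radix_sort` and the parameters `num_workers`, `radix_bits`, and `key_bits` are hypothetical, and the workers are simulated sequentially. On Cell, each chunk would presumably be processed by one SPE, with keys staged through its local store; none of that buffering is modeled here.

```python
def parallel_radix_sort(keys, num_workers=4, radix_bits=8, key_bits=32):
    """Sort non-negative integers below 2**key_bits, one digit per pass.

    Illustrative only: each "worker" owns one contiguous chunk of the
    input, and the workers are run one after another in this sketch.
    """
    radix = 1 << radix_bits
    mask = radix - 1
    src = list(keys)
    n = len(src)
    # Chunk w covers indices bounds[w] .. bounds[w + 1] - 1.
    bounds = [n * w // num_workers for w in range(num_workers + 1)]

    for shift in range(0, key_bits, radix_bits):
        # Step 1: every worker counts the digits in its own chunk.
        hist = [[0] * radix for _ in range(num_workers)]
        for w in range(num_workers):
            for i in range(bounds[w], bounds[w + 1]):
                hist[w][(src[i] >> shift) & mask] += 1

        # Step 2: exclusive prefix sum in (digit, worker) order gives each
        # worker a private write offset for every digit value.
        offsets = [[0] * radix for _ in range(num_workers)]
        total = 0
        for d in range(radix):
            for w in range(num_workers):
                offsets[w][d] = total
                total += hist[w][d]

        # Step 3: every worker scatters its keys. The write ranges are
        # disjoint, and visiting chunks in order keeps the pass stable.
        dst = [0] * n
        for w in range(num_workers):
            for i in range(bounds[w], bounds[w + 1]):
                d = (src[i] >> shift) & mask
                dst[offsets[w][d]] = src[i]
                offsets[w][d] += 1
        src = dst

    return src
```

Calling `parallel_radix_sort([170, 45, 75, 90, 802, 24, 2, 66])` returns the keys in ascending order. Because the offsets computed in step 2 never overlap between workers, steps 1 and 3 need no synchronization beyond a barrier between them, which is what makes this structure a common starting point for multi-core radix sorts.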
Similar resources
Partitioned Parallel Radix Sort
Load balanced parallel radix sort solved the load imbalance problem present in parallel radix sort. By redistributing the keys in each round of radix, each processor has exactly the same number of keys, thereby reducing the overall sorting time. Load balanced radix sort is currently known as the fastest internal sorting method for distributed-memory multiprocessors. However, as the computation ...
Parallel Sorting and Motif Search
Dissertation presented to the Graduate School of the University of Florida in partial fulfillment of the requirements for the degree of Doctor of Philosophy. By Shibdas Bandyopadhyay, August 2012. Chair: Sartaj Sahni. Major: Computer Engineering. With the proliferation of multi-core architectures, it has become increasingly important to design versions of popular...
A High Performance Parallel IP Lookup Technique Using Distributed Memory Organization and ISCB-Tree Data Structure
The IP lookup process is a key bottleneck in routing due to growing routing table sizes, increasing traffic, and the migration to IPv6 addresses. IP address lookup involves computing the Longest Prefix Match (LPM), for which existing solutions, such as BSD radix tries, scale poorly when traffic in the router increases or when they are employed for IPv6 address lookups. In this paper, we describe a ...
Keys Per Processor (n/p): Radix Sort, Bitonic Sort, Sample Sort, Simple Radix Sort
We have developed a methodology for predicting the performance of parallel algorithms on real parallel machines. The methodology consists of two steps. First, we characterize a machine by enumerating the primitive operations that it is capable of performing along with the cost of each operation. Next, we analyze an algorithm by making a precise count of the number of times the algorithm perform...
Publication date: 2007